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A detailed stochastic model of single-gene auto-regulation is established and its solutions are 
explored when mRNA dynamics is fast compared with protein dynamics and in the opposite regime. 
The model includes all the sources of randomness that are intrinsic to the auto-regulation process 
and it considers both transcriptional and post-transcriptional regulation. The timescale separation 
allows the derivation of analytic expressions for the equilibrium distributions of protein and mRNA. 
These distributions are generally well described in the continuous approximation, which is used to 
discuss the qualitative features of the protein equilibrium distributions as a function of the biological 
parameters in the fast mRNA regime. The performance of the timescale approximation is assessed 
by comparison with simulations of the full stochastic system, and a good quantitative agreement is 
found for a wide range of parameter values. We show that either unimodal or bimodal equilibrium 
protein distributions can arise, and we discuss the auto-regulation mechanisms associated with 
bimodality. 
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I. INTRODUCTION 

The role of stochasticity in cells and microorganisms 
has been discussed theoretically since the 1970s 12 [5]. 
Because cellular processes often rely on chemical reac- 
tions, and correspondingly on chance encounters between 
molecules or molecular complexes, stochastic effects due 
to small numbers are ubiquitous in the cell. In particu- 
lar, cellular decision processes, which are of of paramount 
importance as they allow cells to react to the internal 
and external media, are based on gene activation and 
regulation, often depending on random association and 
dissociation events. While many works focus on the lim- 
its imposed by stochasticity and the evolution of noise- 
minimization strategies [U |3H5] , there is a growing inter- 
est in possible functional roles of noise. Generically, the 
basic role of randomness in gene expression is to provide a 
natural means of generating phenotype variability across 
a population, enhancing its capacity to quickly adapt to 
fast-changing conditions. 

The evolution of experimental molecular biology tech- 
niques has made single-cell measurements possible and 
brought numerous confirmations of the presence of 
stochastic effects in gene expression [B], prompting a re- 
newed interest in the mechanisms underlying gene ex- 
pression and regulation in general, and specifically on 
the sources of randomness affecting them. The fact that 
genes coding for specific proteins are often present in sin- 
gle copies may introduce considerable noise. Further- 
more, mRNAs are commonly present in low copy num- 
bers, from a few to a few hundred molecules, and many 
proteins also exist in low number. Because transcription. 



translation and degradation events are stochastic, finite 
size fluctuations in mRNA and protein numbers become 
important. Stochastic effects may suffice to drive long ex- 
cursions of a gene's expression to higher or lower values, 
producing well-defined pulses in single cell protein abun- 
dances over time and/or multimodal protein expression 
distributions in a population. Fluctuations of the bio- 
logical parameters of the system under consideration are 
another source of randomness. For example, we charac- 
terize an active gene by a constant effective transcription 
rate, while this rate may depend on the presence of tran- 
scription factors whose concentration fluctuations induce 
fluctuations of the effective rate. Examples of theoretical 
approaches to these ideas can be found in [THS]. 

The recent development of single-molecule techniques 
led to the experimental identiflcation of another, more 
specific source of variability in gene expression that ac- 
counts for the heavy-tailed distributions often found 
in measures of population distributions of protein and 
mRNA abundance: both transcription and translation 
have been found, in many cases, to occur in time-localized 
bursts resulting in a geometrically distributed number of 
molecules, see (TUHH]. 

As experimental evidence of these sources of random- 
ness accumulates dH HJ] , the tools of statistical physics 
are being called upon for the development of a theoretical 
understanding of the underlying mechanics in noisy gene 
expression. Several models of the simplest elements of a 
gene regulatory network have been studied as stochastic 
processes that include a representation of some of these 
sources of randomness J, 16 19J. As expected, the sta- 
tionary solutions of these models may differ significantly 
from what one would obtain by simply adding a noise 
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term to the equations stemming from a deterministic de- 
scription. Moreover, the analytic solutions that can be 
obtained under certain assumptions were found to be in 
agreement with a wide set of experimental data fl3| . 

In this paper, we make use of these tools to study 
a bottom-up model for single-gene auto-regulation that 
includes all the sources of randomness that are intrin- 
sic to the auto-regulation process and is applicable in 
general to any protein species, auto-regulated by means 
either of transcriptional, as it is commonly considered, 
or post-transcriptional regulation. Analytic solutions of 
the general model obtained in two complementary ap- 
proximations for the relative timescales of protein and 
mRNA dynamics are discussed in terms of the qualita- 
tive features of the equilibrium protein distributions. The 
conditions for these approximations to hold are studied 
in some detail, and in their expected region of validity 
we find good quantitative agreement with the results of 
stochastic simulations of the full system. We use the ana- 
lytic solutions to discuss the conditions under which sin- 
gle gene auto-regulation gives rise to bimodal protein dis- 
tributions. Although these distributions are often associ- 
ated in the literature with the presence of more complex 
regulation mechanisms, we find that those conditions are 
quite general. 

The paper is organized as follows. In Section [iTj we 
establish the stochastic model. In Section [III] we present 
the solutions of the model for the protein and mRNA 
equilibrium distributions in the timescale separation ap- 
proximations, and we discuss the qualitative features of 
the former. In Section [IV[ we study the validity of the 
approximations and compare the approximate analytic 
solutions with the results of simulations. We conclude in 
Section[V] Six appendices contain technical details which 
are too cumbersome to include in the main text. 



For concreteness, we will consider regulation to be ef- 
fected by protein dimers, in agreement with experimental 
evidence for some particular proteins |5T]. Other choices, 
such as monomer binding [16^ or a general cooperative 
binding modeled by a Hill function [7] have been used in 
the literature. We assume the protein and mRNA pop- 
ulations to be non-interacting except for the fact that 
proteins dimerize prior to binding to the promoter. The 
timescale of promoter reactions (~ seconds) is assumed 
much shorter than that of the mRNA (~ minutes to 
hours) and protein (usually ~ hours), in agreement with 
data for the typical timescales of these processes, see for 
example [H] . Since regulation takes place at the DNA 
level, we need to model processes at three different levels: 
the promoter's (DNA), the mRNA's, and the protein's. 




FIG. 1: (Color online) Basic structure of the dynamics of 
a single protein that auto-regulates either (1) transcription- 
ally or (2) translationally. Arrows representing protein and 
mRNA degradation have been omitted. 



II. MODEL 

We study the cell- level dynamics, and corresponding 
population distributions, of a single protein capable of 
auto-regulation and its mRNA. Protein and mRNA con- 
centrations are controlled by the balance between pro- 
duction and degradation events. In transcriptional reg- 
ulation (see Figure [T] left arrow), the regulatory feed- 
back is mediated by binding of a molecule, whose con- 
centration depends on that of the protein itself, to the 
promoter region in the DNA to alter the transcription 
rate of its mRNA. This is the most commonly stud- 
ied mechanism of gene regulation, but other mechanisms 
have been reported in the recent literature that act post- 
transcription, at the mRNA rather than at the promoter 
level [50]. In this translational regulation scenario, the 
regulator molecule interacts with the mRNA to change its 
rate of protein production (see Figure [l] right arrow). In 
what follows we derive the Master Equations that govern 
protein and mRNA abundances in both these scenarios, 
starting with transcriptional regulation. 



A. DNA Level 

For promoter dynamics, we essentially follow |23j . 
adapted to a fully stochastic description. The promoter 
site is assumed to bind only one dimer molecule at a 
time. To avoid unnecessarily heavy notation, in what 
follows we assume a given value of protein copy number 
n in the cell; probabilities should accordingly be taken as 
conditional probabilities given n. Denote by P/ the free 
promoter state and by P\, the bound state of the pro- 
moter and a dimer. For each instant t, let p{Pf, t) be the 
probability of the promoter being free, and p{Pb,t) the 
probability of it being bound to a dimer. The evolution 
of the probability of the bound state is governed by the 
Master Equation 

piPb, t\j)^ J k+piPf, t\j)- k-piPt, t\j) , (1) 

where and k~ are the promoter site binding and un- 
binding rates, and j is the number of dimer molecules 
available for binding to the promoter. Since at all times 
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the promoter is either free or bound to a dimer, we have 
alsop{Pf,t\j)+piPb,t\j) = l. 

The number of dimers in the ceU as a function of pro- 
tein copy number is given in a rate equation description 
by (see for instance [24] or Appendix IaI): 



712 (n) 



(2) 



where a is a dimensionless parameter defined by a = 
^/V/{8kd), V is the cell's volume and kd is the ratio of 
the dimerization and undimerization rates. If there are 
712 {n) dimers in the cell, the equilibrium probability dis- 
tribution for the number j of dimers available through 
diffusion for binding to a promoter with characteristic 
volume Vp much smaller than the cell's volume V is given 

by m 



(3) 



where Vj (9) is the Poisson distribution of mean 6 (eval- 
uated at j), and A = Vp/V <C 1. When writing down 
([3]) we have taken into account that for typical values 
of protein (and protein dimer) diffusion coefficients and 
dimerization rates, see [3| 1251 [26] , dimer formation and 
dissociation within the small volume Vp occurs with neg- 
ligible probability compared to diffusion into and out of 
Vp. Note that A is typically very small, since promot- 
ers have linear dimensions in the nanometer range and 
cells in the micrometer range. As discussed for example 
in [27 , other transport mechanisms more efficient than 
three-dimensional diffusion must be at play that enable 
the promoter to gauge the actual number of molecules in 
the cell. Assuming that transport does not distinguish 
between dimers, and that the number of dimers does not 
influence the transport of a single dimer (essentially, that 
dimers are independent regarding transport, as is the case 
for diffusion), the distribution of dimers in Vp is binomial 
in general, with an "effective rate of volumes" parameter 
A. In the relevant limit A <C 1 we regain ([3|. 

We will now explicitly take into account that the 
promoter timescale is much shorter than the protein 
timescale by assuming that the distributions p{Pf,t), 
p{Pb,t), have time to reach equilibrium for each fixed 
value of the number of proteins. Using the equilibrium 
dimer number distribution, we have: 

p^'^{P) = J2p^'^{P\j)r,iXn,{n)) , (4) 

with P G {Pf,Pb}. Solving equation ([T]) in equilibrium 
(p = 0) and substituting in Q leads to 
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p^^(n|n) = ^^^7',(An2(n)) 



(5b) 



where we have introduced the dimensionless parameter 
k = k^ /k" and we now emphasize the dependence on 
protein copy number n. 



B. mRNA and Protein Levels 

The production of mRNA and protein molecules in the 
cell has been found, in many cases, to occur in sharp ge- 
ometrical bursts PUHTB] . Although the concept of bursts 
and the mechanisms underlying them are still open to dis- 
cussion, see for example [28, 29', a basic description stems 
from two simple ideas. First, if transcription/translation 
events are widely spaced compared to their duration, it 
is reasonable to speak of burst events. Second, the ge- 
ometric distribution relates to the number of consecu- 
tive "heads" in the throwing of a (generally biased) coin; 
thus, if during a burst event there is a fixed probabil- 
ity that another molecule will be produced, a geometri- 
cally distributed number of molecules results. A major 
achievement of this burst description is that the resulting 
predicted form of unregulated protein expression distri- 
butions [71 130 j is remarkably simple and fits an impressive 
number of experimental distributions measured for yeast 
populations |T3]. We adopt here an approach in which 
bursts are formulated in a stochastic framework both for 
transcription and for translation. 

Owing to the timescale separation between promoter 
and mRNA/protein dynamics, for transcriptional regula- 
tion the latter are described in chemical reaction notation 
by 
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P 





> Pmm , 



m + iJLpp 



(6) 



Here m is the mRNA and p is the protein, while n stands 
for protein copy number. / is the regulation function, 
such that: 



J>0 



pkj 



kj 



(7) 



Thus, is the transcription rate when the promoter 
is free, and pPm is the transcription rate when the pro- 
moter is bound to a dimer; the protein exhibits nega- 
tive auto-regulation (auto-inhibition) if p < 1, and posi- 
tive auto-regulation (auto-activation) if p > 1; /i^ is the 
mean transcriptional burst size. With the burst scenario 
in mind, the transcription rates above are to be inter- 
preted as the mean rates at which a transcription event 
takes place; this event is modeled as the instantaneous 
transcription of a certain number (drawn from a geomet- 
ric distribution) of mRNA molecules. We assume here 
that regulation affects only the base transcription rate, 
and not burst size. Finally, 5m is the mRNA degradation 
rate. Similar definitions stand for the protein parameters 
(with j3p the translation rate, interpreted as the rate at 
which a single mRNA molecule initiates an instantaneous 
translational burst, /ip the mean translational burst size, 
and 5p the protein degradation rate). 
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It is interesting to see that the timescale separation for 
promoter dynamics allows all details of regulation to be 
condensed in the regulation function. Different regula- 
tory dynamics affecting only the transcription rate and 
obeying the same timescale separation may be modeled 
in this framework simply by considering a different form 
of f{n). Note also that a useful approximation to the 
regulation function as defined by ([t]) exists if fc <C 1. If 
An2(n) is small, the low j terms of the sum will dominate; 
taylor expansion of the denominator to lowest order in kj 
(for kj ^ 1) and explicit calculation of the sum leads to: 



fin) 



1 + pkXn2{n) 
1 + kXn2{n) 



(8) 



If Xn2{n) is large, the large j terms dominate, and 
the approximation given by ([s]) remains valid because 
(I + pka) /(I + ka) « p for large a. Direct numerical cal- 
culation reveals that ([8| is a good approximation overall, 
even for moderately large values of fc < 1, see Figure [2] 




FIG. 2: (Color online) Aproximation ^ for the regulation 
function. We fixed A = 10^^, p = 10, a = 10^ for a typical 
example. Bottom curves: k = 10"^; top curves: k = 0.5. 



Let E,{0) 



be the geometric distribution of 



mean 9 (evaluated at i), conditioned to non-zero values 
i ^ 1 because a burst of zero molecules has no physical 
meaning 533]. Let also pj,n(t) be the joint probability 
distribution of mRNA and protein copy numbers (evalu- 
ated at mRNA copy number j and protein copy number 
n) at time t. Then the Master Equation for the process 
^ reads 

+ l3pjY,EMiEp' - l)+5p(Ep ~ l)nlpj,„(i). 



(9) 



where we have made use of the "step operators" E,„ , Ep 



defined by: 



Emffj,n(0 = 9]+t,n{t) 1 



(10) 



for any function g depending on mRNA copy number j, 
protein copy number n, and time t. With this notation, 
each term of the sum in the first term on the right hand 
side of ([9]) stands for the creation of i mRNA molecules 
through a burst of size i, with bursts of arbitrary size oc- 
curring at rate /?„/(«). The second term represents the 
degradation of one mRNA molecule, occurring at rate 
5mi- Each term of the sum in the third term on the right 
hand side stands for the creation of i protein molecules 
through a burst of size i, with bursts of arbitrary size 
occurring at rate f)pj. Finally, the last term in ^ repre- 
sents the degradation of one protein molecule, occurring 
at rate SpU. 



C. Translational Regulation 

Consider now the case of translational regulation. Fig- 
ure [1] right arrow. In this case we assume that mRNA 
production proceeds through bursts without protein reg- 
ulation, and that the rate of production of protein bursts 
in translation is modulated by a regulation function 
f{j,n) depending on mRNA and protein copy numbers 
and describing an interaction (direct or indirect) of the 
protein with its mRNA. Then, mRNA and protein dy- 
namics are described in chemical reaction notation by 







p„,m 



m !• TO -I- /ipp , 



(11) 



The Master Equation for the process (11 1 then reads 

PoAt) = [^m ^^(/^™)(Em'' - 1) + <5m(E™ - l)j 

+ E,{pp){E-' ~ l)f{j, n) (12) 

-I- Sp(Ep - l)n Pj^t) . 



III. APPROXIMATE SOLUTIONS FOR THE 
EQUILIBRIUM DISTRIBUTIONS 

In what follows, n and j always stand for protein and 
mRNA copy numbers, respectively. The coupling be- 
tween mRNA and protein reactions leads to correlations 
between the random variables corresponding to n and j. 
As a result, the joint distribution pj^n does not factorize 
and separate Master Equations for n and j do not exist. 
Studying the solutions of the Master Equations (12) 
in general calls for direct numerical simulations of the 
dynamics or numerical integration techniques. However, 
further timescale separations between mRNA and pro- 
tein dynamics may be explored to simplify the problem. 
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We say mRNA is fast (compared to protein) if we can 
write the joint equilibrium distribution as: 



Pn 



1j\nPn 



(13) 



where qj^^ is the equilibrium solution to the Master Equa- 
tion for mRNA with fixed n. This means that mRNA 
dynamics are fast enough for a large number of mRNA- 
only reactions to take place before an n-changing reac- 
tion occurs, so that gj|„ reaches equilibrium and the time 
spent out of equilibrium is negligible. Then, by substi- 
tuting (13) in the appropriate general Master Equation 
and summing over j, we obtain an equation for in- 
dependent of j. The physical idea is that for a certain n 
the mRNA will essentially sample the distribution gjj^^, 
and j-dependent quantities are correspondingly averaged 
over this distribution. 

Similarly, we say protein is fast if we may write 



P 



eq 



eq eq 



(14) 



with analogous interpretations. In this case, for each j 
the p^l^. distribution is sampled and rt-dependent quan- 
tities are averaged over it. 

We postpone to Section [TV] the analysis of the condi- 
tions under which such a separation holds as a good ap- 
proximation, and use it here to write down an equation 
for p"^ or g^' from which approximate analytic expres- 
sions for the stationary solutions of the Master Equations 
^ will be derived. 



A. Transcriptional Regulation Under Fast mRNA 
Dynamics 

In this section we consider transcriptional regulation 
for the case of fast mRNA compared to protein dynamics. 
We explore both the discrete scenario and a continuous 
approximation. It is convenient in this case to consider 
fixed n, since fast niRNA dynamics should allow mRNA 
copy number to equilibrate for each fixed protein copy 
number. This means we are considering the reactions 



(15) 



at fixed n. Let qj|„(t) be the distribution of mRNA copy 
number (evaluated at j) at time t, given n. The Master 
Equation for this process has the simple form: 



1j\r 



{t) ^\l3,^fin)Y,E^{fi.m){E-: - 1) 



(E„ - l)j\qj\„{t) 



(16) 



Let q^^ be the equilibrium distribution of mRNA copy 
number, for each protein copy number n. The mean value 



of mRNA corresponding to this distribution can be found 
to be (see Appendix [b]) 

(id)g^? = Mm7m/(n-) , (17) 

where = l3m/Sm, and id is the identity function. 
In the protein timescale, we have the reactions: 



fJ-pP 



(18) 



Let Pnit) be the distribution of protein copy number 
(evaluated at n) at time t. According to ( 13 ), the Master 
Equation for this process reads 



p^{t)=[Y,l3pE.{fip){E;'- l)jq''' 



Sp{Ep- l)n pn{t) 



]>0, 

i>l 



[r6pY,E^{fip){E-'- l)f{n)+ Sp{Ep~ 



1) 



Pnit), 



(19) 

where r = ^mlmlp and 7p = Pp/Sp. The parameter r 
is the prefactor of the average effective rate of transla- 
tion burst events scaled by the degradation rate of the 
protein, rf{n). We will see that, together with the aver- 
age translational burst size fj,p, it determines the protein 
equilibrium distribution in this approximation. 

As expected, when mRNA is fast, protein dynamics 
depends at each time only on the average mRNA corre- 
sponding to the available protein number n. Specifically, 
the translation rate becomes proportional to {idj^'^i, 

which is in turn proportional to f{n). Through this 
mechanism, promoter-level regulation gauges the num- 
ber of proteins present in the cell at a certain time. Note 
also that further details of mRNA dynamics, including 
burst-like production, are lost at the level of protein. 

Let us consider as well a continuous approximation of 
the dynamics. For this we take x = Xn as an "approxi- 
mately continuous" variable (recall that A <C 1). A con- 
tinuous Master Equation for the distribution p{x,t) of 
protein "concentration" x reads (see Appendix [C|) 

p{x, t) =rdp / f{y) E{x - y, flp) - Soix - y) p{y, t)dy 
Jq l j 

-I- Sp dx xp{x, t) , 

(20) 

where E{x,6) = {l/0)e is the exponential probabil- 
ity distribution of mean d evaluated at x and (5_d is the 
Dirac delta. For simplicity we have chosen to keep the 
symbol /, such that ](x) — f{n) for x = An. The expo- 
nential distribution term accounts for the contribution to 
p{x, t) due to bursts leading to concentration x, and the 
Dirac delta term accounts for bursts away from 
the rescaled burst size, flp = Xfip. The last term is due 
to protein degradation. 

The equilibrium solution of (20 1 can be found to be 
(see Appendix [P]) 



(21) 
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where the constant Ac depends on the arbitrary constant 
c and is determined by normaUzation. 



If we solve equation ( 19 1 directly in the discrete setting 



(see Appendix |E|, we find the solution 



eq 



n 



(22) 



for n ^ 1, with determined by normalization. 

Generically, the continuous approximations presented 
throughout this section are very accurate for burst sizes 
of order 10 and higher. It should be noted, however, that 
very sharp peaks (with a width of the order of a single 
molecule) that arise for zero protein or mRNA in some 
parameter ranges are not well captured by the continuous 
approximation. 

The role of the biological parameters in the qualita- 
tive features of the protein distribution is particularly 
clear in the continuous setting. To study some of these 
features, consider the derivative of the probability distri- 
bution given by (21); concentrations a; > where proba- 
bility peaks correspond to dxP'^'^{x) = 0, leading to 



rjlpf{x) ^ x + fip 



(23) 



Let us consider the regulation function as given by the 
approximation described by (|8]). In the continuous de- 
scription we write: 



fix) 



1 + pkx2{x) 
1 + kx2{x) 



with 



X2ix) = \n2(n) 



X 

2 



■\/a; + I 



(24) 



(25) 



and a = a/\/X. By noting that equation (23 1 is equiva- 
lent to a quartic equation z = ^/x -\- a? ^ iX is easy to 
prove that p^"^ is at most bimodal (see Appendix [F|). 

In the case of negative auto-regulation [p < 1), 
p'^'^ is always unimodal because the regulation function 
is monotonically decreasing. Positive auto-regulation 
(p > 1) is necessary for more structured distributions, 
and bimodal distributions do in fact arise for some pa- 
rameter sets. It is interesting to note that in the limit of 
weak dimerization (large a) p'^'^ is always unimodal, while 
in the limit of strong dimerization (small a) it is unimodal 
if 7 > 1 and bimodal with a peak at zero if 7 < 1; bi- 
modal distributions that do not peak at zero are present 
only for intermediate dimerization (see Figure |3]). Near 
parameter regions allowing for bimodality, the shape of 
p'^'' is also very sensitive to the promoter affinity k (see 
Figure |4]) . Varying r and p affects bimodality as well, 
but the values of these parameters have a stronger effect 
on the peak positions. Finally, the burst size parameter 
Pp also affects the position and relative size of peaks in 
p^"^, but its essential role is to produce the heavy tailed 
distributions commonly observed experimentally. 
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FIG. 3: (Color online) Illustration of the effect of varying 
the dimerization parameter a when bimodality is possible. 
For low dimerization (top) there is only a low concentration 
equilibrium, and for high dimerization (bottom) there is only 
a high concentration equilibrium. BimodaHty without a peak 
at zero arises only for intermediate dimerization (middle). 
Example parameters are r = 5, /ip = 0.9, p — 28, k = 10~^, 
and: Top: a — 50; middle: 5 = 5; bottom: 5 = 0. 




FIG. 4: (Color online) Illustration of the effect of varying 
promoter affinity k when bimodality is possible. We fixed 
r — 5, pp = 0.9, p — 28 and 5 = 5. 



It is now easy to obtain the distribution of mRNA ex- 
pression. For the continuous approximation, taking into 
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account the Master Equation (16), we have as in (20) 

q{z,t I x) = 6„,d^ zq{z,t \ x) 



+ Pmf{x) i E{z - W,flrn) - S{z - w) q{w,t \ x) dw. 
JO 

... (2^) 

This is an evolution equation for the distribution of 
a "continuous" mRNA concentration variable z = Xj, 
given a fixed protein concentration x = Xn (with flm 
again a rescaled burst size). Since / depends on protein 
but not mRNA concentration, we find for the equilibrium 
distribution (see Appendix O) a Gamma distribution: 



q'"^{z I x) = G{z,-i^f{x),jljn) 



(27) 



To find the equilibrium distribution of mRNA, we take 
the integral over all values of protein concentration, 
weighted by the respective probabilities given by (21): 



q^i{z) = / g^«(z I x)p'"i{x)dx 
Jo 



(28) 



Similarly, the solution for the discrete dynamics, cor- 



responding to equation (16), is given by a Negative Bi- 



nomial distribution (c.f. Appendix rEl) : 



(29) 



The discrete equilibrium distribution for mRNA is found 
in this case by summing over all n, weighing with the 
discrete protein distribution given by (22 1: 



eq _ eq eq 



n>0 



^ N. 



1 



(30) 



The performance of the continuous approximation is sim- 
ilar for mRNA and for protein. 



B. Transcriptional Regulation Under Fast Protein 
Dynamics 

It is now convenient to consider fixed j, since in this 
case protein dynamics is much faster and will equilibrate. 
Let Pn\j{t) be the distribution of protein copy number 
(evaluated at n) at time t, given j. We have again reac- 
tions ( 18 ), but in this case we write the Master Equation 



for fixed mRNA copy number j: 



Pn\oit) = [/3pj^i;,(/Xp)(Ep'-l) +,5p(Ep-l)n]p„|j(i). 

(31) 



In the continuous approximation, we find: 



p{x, t\z)= %Sp / E{x - y, jlp) -5d{x~ y) p{y, t) dy 
Jo '- 



Spdx 



xp{x, t) 



(32) 

where 7p = 7p/A. This equation can be solved for the 
equilibrium distribution in exactly the same way as equa- 
tion (26), yielding: 



p'"'(x I z) = G{x,%z,jlp) 



(33) 



Similarly, the discrete solution (equation (31 1 ) is: 

1 



Pn\ 



J \ A^p - 1 ^ip 



(34) 



where Nn{0, ■) = Sno, with 5„ g a Kronecker delta sym- 
bol. 

Following arguments similar to those leading to equa- 
tion (19), the Master Equation for mRNA reads in this 
case: 



iAt) = [/3™5]i?»(Mm)(E™ - l)(/)p^^ 



(35) 



+ 6raiEm-l)j\qjit) , 

and the corresponding continuous Master Equation is 
g(z,t) = d,ndz zq{z,t) + 

+ Pm I {f)p-i(\w) E{z-w,jj.jn)-SD{z - w) q{w,t)dw. 
Jo '- 

(36) 



The equilibrium solution of equation ( 36 1 can be found 



through the same method as the one used for equa- 
tion (20), yielding: 



(37) 



where is again a normalization constant. The discrete 
solution, for equation (35), is 



eq 1m Qq 
1j = 



1 



1=1 



(38) 



for j ^ 1, with ^q"^ determined by normalization (note 
that (/)pe, ^ /(O) ^ 1). 

In the continuous approximation, the distribution of 
protein concentration follows immediately from the in- 
tegration of the conditional distribution given by equa- 
tion (33): 



p«9(x) = / p^^x I z)q^'i{z)dz 
Jo 

= {G{x,%id,ilp)) . 



(39) 
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The corresponding discrete distribution is: 



eg \ ^ eg eg 



j>0 



= (iV„ 



-7p id, — 
1 fip 



(40) 



As expected, in this timescale regime the role of the 
regulation function is confined to the level of mRNA. The 
protein distribution depends only on the niRNA distri- 
bution, plus translation rate and protein burst size. 



C. Translational Regulation 

In this scenario, mRNA production takes place 
through bursts without protein regulation and so mRNA 
reaches equilibrium independently of protein concentra- 
tions. Formally, mRNA dynamics decouples from the 



general Master Equation (12), yielding for the mRNA 



distribution qj{t) the Master Equation: 

(41) 

The equilibrium solution for an unregulated process of 
this type, see Appendix [E} is a Negative Binomial: 



TT'^" ■ 

Mm ^ Mr; 



(42) 



whose average is 7mMin- 

In the fast mRNA dynamics approximation, the Mas- 
ter Equation for protein abundances reads 



Pn{t)= [Y,f3pE^i^^p)i^^- l)(jd/(-,"))g- 



(43) 



+ 5p{Ep- l)njp„(t), 

which, with the simple regulation function /(j, n) = 
f{n), reduces to 

(44) 

For a general regulation function f{j, n), (44) still holds, 

where now f{n) = (i(i/(-, n)),=,/(7„Aim)- 

Equation ( 44 ) is the same equation that describes the 



distribution of protein with transcriptional regulation un- 
der fast niRNA dynamics (compare to equation (19l) 



We thus see that the protein equilibrium distribution is 
the same that was found in Subscction |III A[ with the ap- 
propriate interpretation of the new regulation function /. 



Moreover, as we will see in Section IV this solution holds 



under less stringent conditions than that of equation ( 19 1 



Finally let us consider the fast protein approximation 
in the translational regulation scenario. By the same 



arguments of Subsection |IIIB[ the equilibrium protein 

(45) 



distribution will be given by 



„eg 



Pn\j1j ' 
J>0 



where gj'' is the Negative Binomial ( 42 ) and p'^y is the 



equilibrium distribution for fixed mRNA copy number j. 
The latter is the stationary solution to 



(46) 



Sp{Ep~ l)n Pn\j{t) 



already found in Subsection III A (see ( |19[ ), (22)) to be 
given by 



Pr 



JlpPo 



\3 



U np 



-I- 



i=l 



(47) 



or, in the continuous approximation for n (see (21)), by 

For the purpose of comparison of this approximation 
with simulational results, we will use the analytic solution 



tion for p'^^y given by ( 48 ) . 



given by ( 45 ) with ( 42 ) and the continuous approxima- 



IV. VALIDITY OF THE TIMESCALE 
SEPARATION APPROXIMATIONS 

In this section we study the conditions under which 
the timescale separation assumptions used in Section III 
should hold approximately. We illustrate the quality of 
the approximate analytic solutions for protein by com- 
paring them with the results of simulations of the full 
stochastic process described by the Master Equations (|9| , 
(12) using the Gillespie algorithm [31]. In order to il- 



lustrate the agreement with the analytic distributions, 
the simulation curves shown below were plotted for sam- 
pling sizes such that the error bars at each data point 
are smaller than the markers. We have checked that the 
structure of the curves obtained from these simulations 
is robust down to order 10^ independent samples. 

Let the subscripts / and s refer to the fast and slow 
species respectively, and let a and a be the mean and 
standard deviation of the equilibrium distribution, re- 
spectively. Consider also the average times T for a change 
of one molecule to occur. Two conditions must be met: 

1. The fast species must approach equilibrium quickly 
compared to Tg. If a change of the slow species 
produces a change of absolute value Aa/ in the 
equilibrium average of the fast species, we must 
have: 



rpUp 



Aaj < 1 



(49) 
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where T^^ is the average time for the production 
of one copy of the fast species, since it is straight- 
forward to check for each case that re-equihbrating 
following a burst is the most demanding scenario. 

The fast species must accurately sample the equi- 
librium average within a time interval . The rela- 
tive standard error of the mean for N uncorrelated 
samples from a distribution of mean a and stan- 
dard deviation a is given by: 



a 



aVN 



(50) 



When the fast species dynamically samples the 
equilibrium distribution, two uncorrelated mea- 
surements will be spaced in time approximately by 
the correlation time Tf = 1/5 f. Thus, consider- 
ing the relative error in an interval Ts, we find the 
condition: 



< 1 



(51) 



We now study the constraints imposed by applying 
conditions 1. and 2. self-consistently in the hypothesis of 
fast mRNA and fast protein, with both transcriptional 
and translational regulation. For transcriptional regula- 
tion under fast mRNA we have 



After a protein event leading to n we may write: 

-1 



rpUp 



Aaf = ^Imlni^f , 



(52) 



(53) 



where A/ is the absolute value of the change in the value 
of / associated with the protein event. Since production 
and degradation reactions must be balanced in equilib- 
rium, we may estimate Ts for macroscopically occupied 
n as: 



Then, setting S — Sp/dm wc find for condition 1. 
S^iprAf < 1 , 

and for condition 2. 



We may combine the two conditions and write: 



(54) 



(55) 



(56) 



(57) 



Note that A/ is bounded by — 1|. It should be men- 
tioned that in the interpretation of a protein transla- 
tion burst as the protein production of a single mRNA 



molecule along its whole lifetime, /3p = ^rn(Mp ~ 1)/a*p 
(see [33]), 5r^p = {^p — l)yLtm7m and the conditions of 
the fast mRNA approximation may be met only for very 
low protein burst sizes, even in the absence of regulation. 

In Figure [5] we plot the equilibrium protein distribu- 
tion obtained from simulations of the stochastic process 
([9]), together with the analytic solution (22). The pa- 
rameter values were chosen taking into account condition 
(57). There is excellent quantitative agreement between 



approximate and exact solutions. 
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FIG. 5: (Color online) Illustration of the fast mRNA approxi- 
mation with transcriptional regulation. Parameters are r = 5, 
= 90, 7„ = 2.25 ■ 10^ p.^ = 2, 5 = 10"^ p = 28, A: = 0.1, 
a = 50 and A = lO"'^. Error bars are smaller than markers. 
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FIG. 6: (Color online) Illustration of the fast protein ap- 
proximation with transcriptional regulation. Parameters are 
7p = 3, /ip = 20, 7™ = 3, /i^ = 20, & = 10^, p = 7.5, k = 0.25, 
a = 200 and A = 10~^. Error bars are smaller than markers. 



For transcriptional regulation under fast protein we 
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have, after an mRNA event leading to j: 
af = 

rpUp 




(58) 



Since the variation of j due to a burst is of order /i^, this 
leads to: 

t^m/S « 1 (59) 
for condition 1., and for condition 2. we find: 

(60) 

The combined condition is: 



(61) 



In Figure|6]we illustrate the behavior of the analytic so- 
lution given by ( [40| ) versus simulations of the full stochas- 
tic process ^ . The parameter values were chosen taking 
into account condition (61), and once again there is ex- 



cellent quantitative agreement. 

Consider now the case of translational regulation. The 
fast mRNA approximation may be treated in much the 
same way as the corresponding transcriptional regulation 
case. Note however that the mRNA-only reactions now 
decouple, and the equilibrium solution for the mRNA 
distribution does not depend on n. Thus Aay = 0, and 
condition 1. imposes no constraints. The resulting con- 
straint is due to condition 2. only and becomes: 




« 1 , 



(62) 



to be considered for macroscopically-occupied values of 
n. Recall that f{n) is bounded by max(l,p). Figure [t] 
again shows excellent quantitative agreement between 
the analytic solution derived in Subsection |III C| for fast 
mRNA dynamics and the equilibrium distributions ob- 



tained from simulations of (12 1. 

Notice the similarity between Figure [7] and Figure [5] 
which is due to the fact that we have considered for the 
translational regulation function f{j,n) = /(n), with 
f{n) the transcriptional regulation function used for the 
results of Figure [5) Notice also that that the ratio S 
between protein and mRNA decay rates is far larger in 
the case of Figure [7] illustrating that, in terms of the 
relative stability of protein and mRNA, the fast mRNA 
approximation is much less demanding for translational 
regulation than for transcriptional regulation. 

For the case of translational regulation in the fast pro- 
tein approximation, the distribution of the fast species at 
fixed j is more structured, and may be bimodal in gen- 
eral. Furthermore, the dependence of peak positions for 
the protein distribution on j is also more complicated. A 



calculation of the parameter constraints imposed by con- 
ditions 1. and 2. in this scenario would not lead to simple 
estimates such as the ones found in the previous three 
cases. However, the arguments and calculations above 
provide the intuition that the separation regime will be 
reached for a certain set of parameters if S is made large 
enough, as shown in Figure |8] This figure also illustrates 
the good performance, despite the pronounced peak at 
low X, of the continuous approximation, which was used 
to plot the analytic curve, see Subsection |III C| In the 
preceding figures the approximation for continuous n is 
also very good (data not shown). 
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FIG. 7: (Color online) Illustration of the fast mRNA approxi- 
mation with translational regulation. We took /(j, n) = /(n), 
and parameters are r = 5, fip = 90, 7m = 6.3 • 10"*, fim ~ 2, 
5 = 1, p = 28, fc = 0.1, a = 50 and A = 10"^. Error bars are 
smaller than markers. 
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FIG. 8: (Color online) Illustration of the fast protein approxi- 
mation with translational regulation. We took f{j, n) = fin), 
and parameters are 7p = 0.25, = 90, 7m = 10, Hm = 2, 



5 = 10^ p = 28, fc = 0.1, 
are smaller than markers. 
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V. DISCUSSION AND CONCLUSIONS 

In this paper we have established a detailed stochastic 
model of single-gene auto-regulation and explored its so- 
lutions when mRNA dynamics is fast compared with pro- 
tein dynamics and in the opposite regime. The timescale 
separation allows the derivation of analytic closed form 
expressions for the equilibrium distributions of protein 
and mRNA. Except for very small number of molecules, 
these distributions are well described in the continuous 
approximation, which we discuss in detail. We typically 
find distributions that differ significantly from gaussian 
distributions and exhibit heavy tails. This is the effect of 
an essential ingredient of the model, the transcriptional 
and translational bursts, which typically have a magni- 
tude comparable to system size. The continuous approx- 
imation is well suited to the description of the qualita- 
tive features of the protein equilibrium distributions as 
a function of the biological parameters for fast mRNA. 
In particular, we find that for positive auto-regulation 
and intermediate values of the dimerization parameter 
a the protein equilibrium distributions are bimodal with 
two non-zero peaks in a significant range of the remaining 
parameters. In more general terms, our results show that 
a fully stochastic description of single-gene positive auto- 
regulation generates structured protein distributions that 
otherwise can only be explained in the framework of more 
complex gene regulatory networks. 

We discussed the conditions under which the timescale 
separations hold in good approximation and illustrated 
the performance of both regimes for transcriptional and 
for translational regulation by comparison with simula- 
tions. We found a broad range of parameter values where 
one of the two opposite timescale regimes provides a very 
good approximation. In this range, statistical measures 
such as mean and variance commonly used in the biologi- 
cal literature to characterize experimental results on pro- 
tein and mRNA abundances in ensembles of cells can be 
readily computed from the analytic equilibrium distribu- 
tions derived in Section [Till However, mean and variance 
fall short of characterizing distributions that can be uni- 
modal or bimodal, and non-uniformly heavy tailed. For 
the purpose of comparing the results of the model with 
real data, it is best to consider the full analytic distribu- 
tions. 

Evidence of translational regulation reported in the bi- 
ological literature raises the question of understanding its 
role in the context of stochastic gene expression. Recent 
work has shown how different post-transcriptional regula- 
tion mechanisms modulate noise in protein distributions 
[32j . Here we have shown that the equilibrium protein 
distributions for translational regulation have the same 
form as those that arise under transcriptional regulation 
in the case of fast mRNA. In particular, the structured 
protein distributions produced by transcriptional auto- 
regulation with fast mRNA are also produced by trans- 
lational auto-regulation under less demanding conditions 
in terms of protein and mRNA relative stability. On 



the other hand, we have shown that in the translational 
regulation scenario these structured protein distributions 
are often found as equilibrium solutions also for fast pro- 
tein dynamics. These properties suggest for translational 
regulation an additional biological rationale: it allows for 
efficient auto-regulation, circumventing mRNA stability. 
This idea concurs with the analysis of [22] based on ex- 
perimental data for protein-mRNA lifetime pairs. 

Transcriptional and translational bursts are an essen- 
tial ingredient of the model whose possible underlying 
mechanisms and statistics are currently being discussed 
in the literature. Throughout the paper, we assumed the 
simplest form for these bursts. Extending these results, 
in particular the validity of the timescale separation ap- 
proximation, to the case of more complex mRNA and 
protein production statistics, see [22], will be the subject 
of future work. 
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Appendix A: Dimer Dynamics 

Consider a cell of volume V where there are n copies 
of some molecular species that can be characterized by a 
dimerization rate fcj" (dimensions volume.time"^) and an 
undimerization rate fc^ (dimensions time^^). Our goal 
here is to find the explicit form of n2{n), the number 
of dimers as a function of (fixed) total copy number n. 
The equations governing dimerization dynamics of this 
species at fixed total density (j) = n/V are: 



4>f + 202 



(Ala) 
(Alb) 



Equation (Ala) is the rate equation for temporal dynam- 
ics, and the conservation equation ( Alb[ ) refiects that 
molecules are either free (0/ = Uf/V) or bound in pairs 
as dimers (02 = n2/V). 

Defining kd = l^d ^ equation ( jAlaj ) yields in equi- 
librium: 



4>2 = kd4>} 



(A2) 
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Using equation (Alb) for <j)f leads, in terms of copy num- 
ber, to the desired result: 



n2\n) ^ -+a -a 



\/ n + a? 



(A3) 



where a is a dimcnsionless parameter defined by a = 
A/(8fcd). ^ 

It is also interesting to note that there arc two limits 
in which ( A3 ) becomes very simple and intuitive. One 



the one hand, if a <C n, we find: 
n2{n) « - 



(A4) 



In physical terms, this can be understood as follows: for 
a certain density n/V , if kd is high enough most proteins 
will bind in dimers; conversely, for a certain kd^ if density 
is high enough most proteins will again be bound because 
of increased collision probability. On the other hand, if 
3> we are in the opposite limit where most proteins 
will be free. Taylor expansion of the square root leads in 
lowest order to: 



712 (n) 



kd 2 



(A5) 



This result can also be found by setting in (|A2 



Appendix B: Mean mRNA in Equilibrium (Fast 
mRNA) 



Consider the mRNA master equation (16 1. Multiply- 



ing both sides by j and summing over j we find an equa- 
tion for the mean: 



5t(*d),|„(t) =[/3™/(n)^ii;,(Ai„)^j-(Ern - 1) 

+ j(E„i - l)jj<7i|n(i) ■ 

J>0 



(Bl) 



Let us compute (omitting the arguments t, n for sim- 
plicity): 



j>0, 



— Mm ; 

(B2) 

where we have made use of the fact that qj = whenever 
copy number j is negative. Now let us look at: 



i^o j>o j>o 

= (B3) 



Since we are looking for the equilibrium mean we now 
set the left-hand side of (Bl) to zero, and using re- 



sults (B2| and (B3) we find the desired result: 



(B4) 



Appendix C: Continuous Approximation 

Here we study a continuous approximation for equa- 
tions of the form: 

(CI) 

where / is some function of (protein or mRNA) copy 
number n, r ^ and i5 7^ are constants, and the step 
operator E raises n. For some time t, let copy number n 
be fixed, and let a; = An be the corresponding concentra- 
tion. In accordance with the main text, the convention 
f{n) = f{x) will be used. First, note that a reasonable 
definition for the continuous distribution obeys: 



Pr,{t) = p(x, t)X \{n + 1/2) - {n - 1/2) 
fn+1/2 

p{x,t) dx = \p{x,t) . 

n-1/2 



(C2) 



Now consider the conditioned geometric distribution. 
We have: 



1 



-nlog(l-l/p) 



(C3) 



If we take S> 1 (which is biologically common, espe- 
cially for proteins, see for example [HlllS]) and expand 
log(l — l/^i) around l//i = we find to lowest order: 

Er,{^J) ~ -e-"/^ = A-e-^/'^ = A£;(x,m) , (C4) 

with jl = A/z. Now notice that, apart from constant 
coefhcients, the creation term in equation (CI) may be 
written: 

^i?,(Ai)(E-^-l)/(n)p„(t) 

= ^ E^{^)f{n - i)pn-t - f{n)pn{t) ^^^^ 

n 

= Y i^n-iifJ.) - ^n,i) f{i)Pi , 



i=0 



where 6n.i is a Kronecker Delta symbol. Note that the 
upper limit of the sum can be extended to infinity by 
taking Ej{fi) = for j ^ 0, and the lower limit can be 
extended to negative infinity since pi = for z < 0. 
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The Kronecker Delta term reads: 

n 



(C6) 



A / Soix - y)f{y)p{y,t)dy 
Jo 



where 5d is the Dirac Delta. Notice that, for a meaning- 
ful conversion to the continuous case, the lower limit of 
the integral must be strictly included (in order to encom- 
pass the contribution of the Delta function). Thus, the 
upper and lower limits of the integral may be extended 
to infinity. 

For the conditioned geometric distribution term in 



(C5) we may write: 



i=0 



A / E{x " y,fj.)f{y)p{y,t)dy 



i=0 



(C7) 

where again the upper and lower limits of the integral 
may be extended do plus and minus infinity by consider- 
ing, respectively, E{y, p,) — and p{y, t) — for negative 
y. Here, the approximations 3> 1 (approximating the 
conditioned geometric distribution with an exponential 
distribution) and A ^ 1 (approximating the sum with an 
integral, i.e. considering x continuous) have been explic- 
itly used. 



Finally, the degradation term in equation (CI I reads, 
apart from a factor of S: 



(E - l)npn(t) = (n + l)p„+i(t) - npn{t) 
1 



{x + X)\p{x + X){t) - xXp{x,t) 

A L 

w \dx{xp{x,t)) , 

(C8) 

where we again make use of A <C 1 to approximate a finite 
difference with a derivative. Noting that Pn{t) — Ap(x, t) 
and collecting terms we find: 

p{x, t) =rS / /(y) E{x -y,fl)- 6d [x - y) p{y, t) dy 



5dx 



xp{x, t) 



(C9) 



we may write: 

p{x,t) = rS{w{fl) * fp){x,t) +5dx xp{x,t) 



(D2) 



where * is a convolution product. In equilibrium we have: 



a. 



xp'"^{x) ^ r{w{fi) * fp'"^){x) 



(D3) 



Laplace transformation of this equation leads to: 

s dsp{s)^rw{s)£{fp"^){s) 

= rw{s){f *p){s) 
s 



(D4) 



Here, g{s) = C{g){s) = 



1/11 



+ 00 



~(/*p)(s) 



~'g{x) dx (integration 



limit strictly included) is the Laplace transform of func- 
tion g (evaluated at s), and p = C{p'^^). Convolution the- 
orems have been used in the first and second lines, and in 
the third line the explicit form of w(s) was substituted. 
Rearranging terms we have: 

(s + l//i)p(s) = -r(/*p)(s) , (D5) 

which inverse-transforms to: 

d.^xp^'^ix)] = {rf{x)/x-l/fL)xp^'^{x) . (D6) 

This equation can easily be solved, leading to: 



(D7) 



The constant Ac is determined by normalization (de- 
pending on the arbitrary integration limit c). 

Consider now the case f{x) = 1, for all x. Solving the 
integral in (21 ) and normalizing the probability distribu- 



tion to integral unity we find: 

G{x,rji) . 



P^'ix) 



(D8) 



This is the Gamma distribution of parameters r and // (F 
is the Euler Gamma function). With r = fj-mlmlp and /2 
the mean rescaled protein burst size (with definitions ac- 
cording to the main text), this is the equilibrium solution 
for unregulated protein dynamics with fast mRNA. 



Appendix D: Continuous Equilibrium Distributions 

Here we follow [7] to obtain an analytical solution to 
equation (C9). As discussed in Appendix [C| the upper 
and lower integration limits may be extended to plus and 
minus infinity, respectively. Thus, defining: 



w{x , (1) = E{x, jl) - 5d{x) 



(Dl) 



Appendix E: Discrete Equilibrium Distributions 

In this Appendix we analyze, directly in the discrete 



setting, equation (CI). Analogously to the continuous 



case, the discrete Master Equation may be written: 
Pn{t) = rS{w{^) * fp)(n) +5 {n + l)p„+i(t) - np„(t) 



(El) 
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where * is now the discrete convolution product, and 
w(n,iJ,) = Enip) — Snfl. We now follow the procedures 
of Appendix [D] using the Z transform instead of the 
Laplace transform, g{s) = Z{g){s) = X^n:^ "sC"-)' 
Zijp'^'i) — p. The corresponding equation in "momentum 
space" is: 



(E2) 



s{s - l)dsp{s) + -dsp{s) = -r (^/ * pj {s) 

Inverse-transforming, we get: 

(n + l)pl\, + (1/m - l)np,7 = rf{n)pl^ , 
leading to the recurrence relation: 



r pT^rf{Q)pl\ 

\ (n + lK+i=(r^4 

This is easily solved, yielding: 



(E3) 



(E4) 



(E5) 



for all n ^ 1, with Pq'' determined by normalization 
(and the standard convention that the product equals 
one when the upper limit is smaller than the lower). Note 
that if / is a regulation function as per the main text we 
have /(O) = 1, since the promoter is necessarily free when 
no protein is present. 



Consider now the case f{n) — 1 for all n. Write (E5) 



rfil 



/i — 1 n 



n 



ji ri — 1 

n 



1 



(E6) 

with r' — r^/{id — 1). The product can be solved ex- 
plicitly is terms of Gamma functions, and normalizing to 
unit sum we find: 



1 ffi-i 
1 



Nn ( r', 



T{n + r') 
T{r')T{n + l) 



(E7) 



This is the Negative Binomial distribution of parameters 
7' and 1/^. The parameters are defined such that: 



^„(A;,p)=/(l-p)" 



+ fc- 1 
k - 1 



(E8) 



As in the continuous case (Appendix |D]), with 
f = fJ"m"fm"fp and ^ the mean protein burst size fip 
(definitions according to the main text), this is the 
discrete solution for unregulated protein dynamics with 
fast mRNA (as reported for example in [3D]). 



Appendix F: Bimodal Equilibrium Protein 
Distributions 



Consider the continuous equilibrium distribution for 
protein with fast mRNA, given by (21). The derivative 



of this probability distribution is given by: 



dxP^^x) = rjlf{x) -(x + fl) 



p'^'^ix) 
fix 



(Fl) 



If p^"^ peaks at zero (i.e. if dxP^'^{0) < 0), the term in 
brackets in equation ( |F1| must be negative at zero. Be- 
cause p'^'^{x) > for all X > 0, other extrema of p^'^ must 
satisfy: 



rp-fix) - {x + fl) = . 



(F2) 



Consider f{x) as given by (Is]). A change of variables 



to z = X + a'^ in equation ( F2 1 leads to an equivalent 
quartic equation, 

P4(z) = -z'' + 2az^ + a2Z^ + aiz + ao = 0, (F3) 



where the ai are real constants determined by the bio- 
logical parameters. The equation P'^{z) = is quadratic 
in z and has the two solutions: 




(F4) 



If they are real, one of these solutions necessarily obeys 
z < a. Therefore, P^iz) has at most one root in z > a. 

We now proceed to prove that p'^'' is at most bimodal. 
Since zeros of P4 correspond alternately to maxima and 
minima of p^*, the presence of more than two maxima 
requires at least four positive roots of Pi{z) in z > a. 
But then P^iz) would have at least two roots in z > a. 
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Note that, if E^{6) is the non-conditioned geometrical 
distribution of mean 6, the relation Ei{0) = g^El{6 — l) 
holds for all i ^ 1. Thus, "conditioned bursting" with 
frequency a and mean 6 is equivalent to "non-conditioned 
bursting" with frequency g^jQ and mean 9 — 1, and the 
difference becomes relevant only for 6 ~ 1. 



